ABSTRACT

We evaluate two vector quantizer designs for compression of multispectral imagery and their impact 
on terrain categorization performance. The mean-squared error (MSE) and classification perfor- 
mance of the two quantizers are compared, and it is shown that a simple two-stage design minimizing 
MSE subject to a constraint on classification performance has a significantly better classification 
performance than a standard MSE-based tree-structured vector quantizer followed by maximum- 
likelihood classification. This improvement in classification performance is obtained with minimal 
loss in MSE performance. Our results show that it is advantageous to tailor compression algorithm 
designs to the required data exploitation tasks. Applications of joint compression/classification 
include compression for the archival or transmission of Landsat imagery that is later used for land 
utility surveys and/or radiometric analysis. 



1 Introduction 

The vast majority of vector quantizer (VQ) design algorithms presume the use of mean-
squared error (MSE) as a metric. The shortcomings of MSE as a measure of perceptual
quality in image coding are well known. In this paper, we show that MSE-based quantization severely
degrades the performance of M-ary classification algorithms following compression and de- 
compression. Appropriate design criteria for the joint compression and classification problem 
should include some combination of MSE and Bayes risk. In the context of multispectral 
imagery, MSE is a reasonable criterion for quantizers that are designed to preserve the root 
mean-squared (RMS) radiometric accuracy of the imagery. Bayes risk, on the other hand, 
is appropriate for designs that optimize terrain categorization performance, since it directly 
relates to classification performance. 
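
To make the two criteria concrete, one possible way of writing them down is sketched below. The notation is an assumption introduced here for illustration and does not come from the paper: X is a multispectral pixel vector, Q the quantizer, \delta the classifier applied to the decompressed pixel Q(X), \pi_i the prior probability of class i, and C_{ij} the cost of deciding class j when class i is true. The weighted sum in the last line is only one of several possible combinations of MSE and Bayes risk.

```latex
% Assumed notation (not from the paper): X pixel vector, Q quantizer,
% \delta classifier, \pi_i class priors, C_{ij} decision costs, M classes.
\begin{align}
D(Q) &= \mathbb{E}\,\lVert X - Q(X) \rVert^{2}
  && \text{(mean-squared error)} \\
R(Q,\delta) &= \sum_{i=1}^{M} \pi_i \sum_{j=1}^{M} C_{ij}\,
  \Pr\bigl(\delta(Q(X)) = j \mid \text{class } i\bigr)
  && \text{(Bayes risk)} \\
J_{\lambda}(Q,\delta) &= D(Q) + \lambda\, R(Q,\delta), \qquad \lambda \ge 0
  && \text{(one possible combination)}
\end{align}
```

With 0-1 costs (C_{ii} = 0 and C_{ij} = 1 for i \ne j), R reduces to the probability of misclassification. A closely related form is the constrained problem of minimizing D(Q) subject to R(Q,\delta) \le \epsilon for some tolerance \epsilon, which matches the way the joint design is described in this paper.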

We explore two vector quantizer designs, an independent design and a joint design. The 
independent design uses a standard MSE-based tree-structured vector quantizer (TSVQ) 
followed by a maximum-likelihood classifier that optimizes the probability of correct classification
[5]. The joint design, on the other hand, optimizes MSE performance subject to a constraint
on classification performance. For this latter design, a two-stage quantizer is used [6,7]. The 
first quantization stage is a tree-structured classifier (TSC) [1,2] that essentially performs a 
coarse quantization of the multispectral pixel feature space. This coarse quantizer is then 
refined using a second quantizer that is designed using an MSE criterion. An alternative approach to
the joint compression/classification problem has recently been proposed by Cosman et al. [3].
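
The two-stage structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design used in the paper: a nearest-class-mean rule stands in for the tree-structured classifier, a plain Lloyd (k-means) iteration stands in for the MSE-based second stage, and all function names and the synthetic data are hypothetical. The sketch also assumes every first-stage cell receives at least one training pixel.

```python
import numpy as np

def nearest_index(points, codebook):
    """Index of the nearest codebook row (squared Euclidean distance) per point."""
    d = ((points[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def lloyd_codebook(points, n_codewords, n_iter=20):
    """Plain Lloyd iteration; a stand-in for the MSE-based second stage."""
    # Deterministic initialization from evenly spaced training points.
    idx = np.linspace(0, len(points) - 1, n_codewords).astype(int)
    codebook = points[idx].astype(float)
    for _ in range(n_iter):
        nearest = nearest_index(points, codebook)
        for j in range(n_codewords):
            members = points[nearest == j]
            if len(members):
                codebook[j] = members.mean(axis=0)
    return codebook

def train_two_stage(pixels, labels, bits_stage2):
    """Stage 1: one coarse cell per class. Stage 2: an MSE codebook per cell."""
    classes = np.unique(labels)
    class_means = np.stack([pixels[labels == c].mean(axis=0) for c in classes])
    cell = nearest_index(pixels, class_means)
    codebooks = [lloyd_codebook(pixels[cell == k], 2 ** bits_stage2)
                 for k in range(len(classes))]
    return class_means, codebooks

def encode(pixels, class_means, codebooks):
    """Return (coarse cell index, refinement index) for every pixel."""
    cell = nearest_index(pixels, class_means)
    refine = np.zeros(len(pixels), dtype=int)
    for k, codebook in enumerate(codebooks):
        mask = cell == k
        if mask.any():
            refine[mask] = nearest_index(pixels[mask], codebook)
    return cell, refine

def decode(cell, refine, codebooks):
    """Reconstruct each pixel from its two indices."""
    return np.stack([codebooks[k][j] for k, j in zip(cell, refine)])

# Synthetic two-band data with three well-separated classes (made-up values).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
labels = np.repeat(np.arange(3), 50)
pixels = centers[labels] + rng.normal(size=(150, 2))

class_means, codebooks = train_two_stage(pixels, labels, bits_stage2=2)
cell, refine = encode(pixels, class_means, codebooks)
recon = decode(cell, refine, codebooks)

mse_stage1 = np.mean(np.sum((pixels - class_means[cell]) ** 2, axis=1))
mse_two_stage = np.mean(np.sum((pixels - recon) ** 2, axis=1))
```

In this toy version the first-stage cells are convex, so every second-stage codeword stays inside its cell and the reconstructed pixel keeps the coarse label of the original. The refinement therefore lowers MSE without changing the first-stage decision.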

We present results on the MSE and terrain categorization performance of these two
quantizer designs at various information compression rates for Landsat-4 Thematic Mapper
data collected over Ann Arbor, Michigan. Empirical results indicate that the
joint design provides superior classification performance with minimal MSE degradation. 

2 Results 

We demonstrate that for MSE-based TSVQ codebook designs having large or even moderate 
compression ratios of 8:1 or better, classification performance on compressed imagery is 
severely degraded relative to the performance of the classical maximum-likelihood classifier
operating on uncompressed imagery. This performance degradation is due to the fact that 
at high compression ratios (that is, low code rates), there is a tendency for classes having 
large component variances to mask other classes that have smaller variances, even when the
classes are well separated. This is because the MSE criterion protects against large errors 
regardless of the resulting classification performance. 

Figure 1 shows a scatter plot from two bands of Landsat-4 multispectral data for a simple 
four-class problem; band 5 radiances are plotted against the corresponding band 3 radiances 
for four terrain categories: clouds, soil, water and wetlands. Two different algorithms were 
used to partition the scatter plot into four regions. The partition selected by an MSE-based 
TSVQ is shown in solid lines while the partition selected by a tree-structured classifier is 
shown in dashed lines. Also shown in Figure 1 are the corresponding codewords: each data 
point falling into a given partition element is represented by the codeword for that partition 
element. 

In Figure 1, the large-variance class (clouds) is “over coded.” In the MSE-based partition,
the soil and wetland classes are not distinguished since they fall into a single partition 
element. In this case, compression of the data with the TSVQ would result in a loss of 
classification performance. Nonetheless, the four classes are well separated and a classifier 
partition can be designed to separate all four classes. Indeed, the classifier partition allows 
each of the four terrain categories to be distinguished. 
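
The masking effect can be reproduced on a toy one-dimensional example. The sketch below is not the TSVQ used in the paper: it simply finds the MSE-optimal three-cell partition of a few made-up radiance values by exhaustive search over contiguous partitions of the sorted data. The class names and values are hypothetical and chosen so that one class has a large spread and two tight classes lie close together.

```python
from itertools import combinations

def sse(cell):
    """Sum of squared errors of a cell about its own mean (its codeword)."""
    mean = sum(cell) / len(cell)
    return sum((x - mean) ** 2 for x in cell)

def best_contiguous_partition(values, k):
    """Exhaustively search all splits of the sorted values into k cells."""
    xs = sorted(values)
    best = None
    for cuts in combinations(range(1, len(xs)), k - 1):
        bounds = (0,) + cuts + (len(xs),)
        cells = [xs[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sse(c) for c in cells)
        if best is None or cost < best[0]:
            best = (cost, cells)
    return best

# Hypothetical single-band radiances: one large-variance class, two tight ones.
clouds = [0.0, 10.0, 20.0, 30.0]
soil = [50.0, 50.0, 50.0]
wetlands = [52.0, 52.0, 52.0]

# MSE-optimal partition with three codewords.
mse_cost, mse_cells = best_contiguous_partition(clouds + soil + wetlands, 3)

# Partition that keeps every class in its own cell.
class_cells = [clouds, soil, wetlands]
class_cost = sum(sse(c) for c in class_cells)
```

Here the MSE-optimal partition spends two of its three cells on the large-variance class and merges the two tight classes into one cell, because that gives a much smaller total squared error than the class-preserving partition. The same trade-off is what causes soil and wetlands to share a partition element in Figure 1.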

Figure 1. Two feature-space partitions for the four-class terrain categorization example.

The independent and joint compression/terrain categorization designs were applied to
the six reflective bands from a 185 x 185 km Landsat-4 frame collected over the southeast
Michigan area. A total of 10 general terrain classes, including urban, agricultural, bare soil, range,
deciduous, conifer, water, barren and cloud covered, were located and identified by an
experienced image interpreter. Figure 2 shows a quantitative performance comparison between
the independent and joint design approaches. Specifically, Figure 2 shows both the MSE 
and classification error rate as a function of the code rate. The classification error rate is 
computed with respect to the terrain categorization performance on the original data. The 
various curves in Figure 2 show the performance of four VQ designs: the independent design, 
and joint designs in which the first stage (i.e., the classifier) is allocated 5, 6, and 8 bits. 
The original data rate is 48 bits per pixel (bpp) (i.e., six bands at 8 bits/band/pixel). The 
plots show that a substantial rate decrease can be achieved while still retaining the same 
classification error rate. In particular, at a 4:1 compression ratio, or 12 bpp, the joint scheme
has a 4% RMS radiometric error and a 2% classification error. This should be compared to
the independent scheme, which has a slightly lower RMS radiometric error of 0.5%, but a
significantly larger classification error of 25%.
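
The relation between the compression ratios and code rates quoted in this section is simple arithmetic on the 48 bpp original data rate. The helper names below are hypothetical and serve only to make the conversion explicit.

```python
def bits_per_pixel(original_bpp, compression_ratio):
    """Code rate in bits per multispectral pixel for a given compression ratio."""
    return original_bpp / compression_ratio

def compression_ratio(original_bpp, code_rate_bpp):
    """Compression ratio implied by a code rate in bits per multispectral pixel."""
    return original_bpp / code_rate_bpp

ORIGINAL_BPP = 6 * 8  # six bands at 8 bits/band/pixel, as stated in the text

rate_4_to_1 = bits_per_pixel(ORIGINAL_BPP, 4)    # 12.0 bpp
rate_12_to_1 = bits_per_pixel(ORIGINAL_BPP, 12)  # 4.0 bpp
```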

Finally, Figure 3 shows the output of the terrain categorization step after compression 
at a 12:1 compression ratio (i.e., a data rate of 4 bits/multispectral pixel). Figure 3a shows 
the original classification output. Figure 3b shows the output of the independent compres-
sion/classification design (i.e., the supervised maximum-likelihood classifier operating on
data that has been compressed 12:1 with an MSE-based tree-structured vector quantizer).
Figure 3c shows the output of the joint compression/classification design. In the indepen- 
dent design, the water category is classified as a non-category, while many of the other 
classes are missing completely. On the other hand, in the joint design much of the original 
spatial structure in the classification map is preserved and the classification errors are spa- 
tially localized. In fact, when we examined the difference between the joint design output in 
Figure 3c and the original classifier output in Figure 3a, we found that approximately 93% 
of the classification errors occurred over regions that were 3x3 pixels across or smaller. 
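
A spatial-localization statistic of this kind can be computed by comparing two classification maps. The sketch below is one possible interpretation and not necessarily the procedure used by the authors: it groups misclassified pixels into 4-connected regions and counts a region as small when its bounding box fits within 3 x 3 pixels. The function name, the connectivity rule and the tiny example maps are assumptions.

```python
from collections import deque

def small_region_error_fraction(reference, test, max_extent=3):
    """Fraction of error pixels that lie in small connected error regions."""
    rows, cols = len(reference), len(reference[0])
    error = [[reference[r][c] != test[r][c] for c in range(cols)]
             for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    total = small = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if not error[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill over 4-connected error pixels.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            region = []
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and error[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            height = max(r for r, _ in region) - min(r for r, _ in region) + 1
            width = max(c for _, c in region) - min(c for _, c in region) + 1
            total += len(region)
            if height <= max_extent and width <= max_extent:
                small += len(region)
    return small / total if total else 0.0

# Tiny made-up maps: two isolated errors and one 1 x 4 run of errors.
reference_map = [[0] * 6 for _ in range(5)]
test_map = [row[:] for row in reference_map]
test_map[0][0] = 1
test_map[4][5] = 1
for c in range(1, 5):
    test_map[2][c] = 1

fraction = small_region_error_fraction(reference_map, test_map)
```

In this example two of the six error pixels belong to regions that fit in a 3 x 3 box, so the function returns one third.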

3 Conclusions 

We compared two quantizer designs for the problem of joint compression/terrain categoriza- 
tion of multispectral imagery. The first quantizer design was an independent design, consist- 
ing of a mean-squared error (MSE) based quantizer design followed by a maximum-likelihood 
classifier. The second design was a joint design that employed a two-stage quantizer. The 
first stage consisted of a tree-structured classifier that performed a coarse quantization of 
the image data. This coarse quantization was then refined using a standard MSE-based 
tree-structured quantizer. One can view this two-stage process as one particular approach 
to minimizing MSE subject to a constraint on allowable classification error. 

We showed that the joint design achieved a significant improvement in classification
performance with only a minor degradation in MSE performance. This suggests that significant
increases in data exploitation utility can be achieved by modifying compression algorithm
design criteria to include metrics appropriate to the required exploitation tasks.

Figure 2. MSE and classification performance of the two compression/classification schemes
(plot title: Classification Error vs. Code Rate).

Figure 3. Joint compression/terrain categorization examples: (a) original terrain categorization,
(b) independent compression/classification design (ML classifier, 12:1 compression),
(c) joint classification/compression design (TSC, 12:1 compression).